Results 1 - 17 of 17
1.
Nucleic Acids Res ; 52(D1): D61-D66, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37971305

ABSTRACT

The Cistrome Data Browser is a resource of ChIP-seq, ATAC-seq and DNase-seq data from humans and mice. It provides maps of the genome-wide locations of transcription factors, cofactors, chromatin remodelers, histone post-translational modifications and regions of chromatin accessible to endonuclease activity. Cistrome DB v3.0 contains approximately 45 000 human and 44 000 mouse samples with about 32 000 newly collected datasets compared to the previous release. The Cistrome DB v3.0 user interface is implemented as a single page application that unifies menu driven and data driven search functions and provides an embedded genome browser, which allows users to find and visualize data more effectively. Users can find informative chromatin profiles through keyword, menu, and data-driven search tools. Browser search functions can predict the regulators of query genes as well as the cell type and factor dependent functionality of potential cis-regulatory elements. Cistrome DB v3.0 expands the display of quality control statistics, incorporates sequence logos into motif enrichment displays and includes more expansive sample metadata. Cistrome DB v3.0 is available at http://db3.cistrome.org/browser.


Subject(s)
Chromatin , Databases, Protein , Genomics , Software , Animals , Humans , Mice , Chromatin/genetics , Histones/genetics , Histones/metabolism , Sequence Analysis, DNA , Transcription Factors/genetics , Transcription Factors/metabolism , Data Visualization , Internet , Genomics/methods
2.
Article in English | MEDLINE | ID: mdl-38074525

ABSTRACT

Latent vectors extracted by machine learning (ML) are widely used in data exploration (e.g., t-SNE) but suffer from a lack of interpretability. While previous studies employed disentangled representation learning (DRL) to enable more interpretable exploration, they often overlooked the potential mismatches between the concepts of humans and the semantic dimensions learned by DRL. To address this issue, we propose Drava, a visual analytics system that supports users in 1) relating the concepts of humans with the semantic dimensions of DRL and identifying mismatches, 2) providing feedback to minimize the mismatches, and 3) obtaining data insights from concept-driven exploration. Drava provides a set of visualizations and interactions based on visual piles to help users understand and refine concepts and conduct concept-driven exploration. Meanwhile, Drava employs a concept adaptor model to fine-tune the semantic dimensions of DRL based on user refinement. The usefulness of Drava is demonstrated through application scenarios and experimental validation.

5.
EBioMedicine ; 92: 104581, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37121095

ABSTRACT

BACKGROUND: Rheumatoid arthritis (RA) shares genetic variants with other autoimmune conditions, but existing studies test the association between RA variants with a pre-defined set of phenotypes. The objective of this study was to perform a large-scale, systemic screen to determine phenotypes that share genetic architecture with RA to inform our understanding of shared pathways. METHODS: In the UK Biobank (UKB), we constructed RA genetic risk scores (GRS) incorporating human leukocyte antigen (HLA) and non-HLA risk alleles. Phenotypes were defined using groupings of International Classification of Diseases (ICD) codes. Patients with an RA code were excluded to mitigate the possibility of associations being driven by the diagnosis or management of RA. We performed a phenome-wide association study, testing the association between the RA GRS with phenotypes using multivariate generalized estimating equations that adjusted for age, sex, and first five principal components. Statistical significance was defined using Bonferroni correction. Results were replicated in an independent cohort and replicated phenotypes were validated using medical record review of patients. FINDINGS: We studied n = 316,166 subjects from UKB without evidence of RA and screened for association between the RA GRS and n = 1317 phenotypes. In the UKB, 20 phenotypes were significantly associated with the RA GRS, of which 13 (65%) were immune mediated conditions including polymyalgia rheumatica, granulomatosis with polyangiitis (GPA), type 1 diabetes, and multiple sclerosis. We further identified a novel association in Celiac disease where the HLA and non-HLA alleles had strong associations in opposite directions. Strikingly, we observed that the non-HLA GRS was exclusively associated with greater risk of the validated conditions, suggesting shared underlying pathways outside the HLA region. 
INTERPRETATION: This study replicated and identified novel autoimmune phenotypes verified by medical record review that share immune pathways with RA and may inform opportunities for shared treatment targets, as well as risk assessment for conditions with a paucity of genomic data, such as GPA. FUNDING: This research was funded by the US National Institutes of Health (P30AR072577, R21AR078339, R35GM142879, T32AR007530) and the Harold and DuVal Bowen Fund.
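
Two steps named in this abstract, the genetic risk score and the Bonferroni correction, can be illustrated with a minimal sketch. The allele weights, the dosages, and the family-wise alpha of 0.05 are assumptions made for illustration; only the number of phenotypes tested (1317) comes from the abstract, and the study's actual scoring and modelling may differ.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Hypothetical per-variant weights (for example log odds ratios) and one
# person's risk-allele dosages; neither is taken from the study.
example_weights = [0.25, 0.10, 0.40]
example_dosages = [2, 0, 1]

score = genetic_risk_score(example_dosages, example_weights)
# alpha = 0.05 is an assumption; 1317 is the phenotype count in the abstract.
threshold = bonferroni_threshold(0.05, 1317)
print(f"GRS = {score:.2f}, per-test threshold = {threshold:.2e}")
```

Under these assumptions, an association would need a p-value below roughly 3.8e-05 to count as significant.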


Subject(s)
Arthritis, Rheumatoid , Genetic Predisposition to Disease , Humans , Genotype , Arthritis, Rheumatoid/diagnosis , Arthritis, Rheumatoid/genetics , Risk Factors , Phenotype , HLA Antigens/genetics , Histocompatibility Antigens Class II/genetics , HLA-DRB1 Chains/genetics , Alleles
6.
Bioinformatics ; 39(2)2023 02 03.
Article in English | MEDLINE | ID: mdl-36688700

ABSTRACT

SUMMARY: The regulation of genes by cis-regulatory elements (CREs) is complex and differs between cell types. Visual analysis of large collections of chromatin profiles across diverse cell types, integrated with computational methods, can reveal meaningful biological insights. We developed Cistrome Explorer, a web-based interactive visual analytics tool for exploring thousands of chromatin profiles in diverse cell types. Integrated with the Cistrome Data Browser database which contains thousands of ChIP-seq, DNase-seq and ATAC-seq samples, Cistrome Explorer enables the discovery of patterns of CREs across cell types and the identification of transcription factor binding underlying these patterns. AVAILABILITY AND IMPLEMENTATION: Cistrome Explorer and its source code are available at http://cisvis.gehlenborglab.org/ and released under the MIT License. Documentation can be accessed via http://cisvis.gehlenborglab.org/docs/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Chromatin , Epigenomics , Sequence Analysis, DNA , Chromatin Immunoprecipitation Sequencing , Software , Databases, Genetic
7.
Bioinformatics ; 39(1)2023 01 01.
Article in English | MEDLINE | ID: mdl-36688709

ABSTRACT

SUMMARY: Gos is a declarative Python library designed to create interactive multiscale visualizations of genomics and epigenomics data. It provides a consistent and simple interface to the flexible Gosling visualization grammar. Gos hides technical complexities involved with configuring web-based genome browsers and integrates seamlessly within computational notebooks environments to enable new interactive analysis workflows. AVAILABILITY AND IMPLEMENTATION: Gos is released under the MIT License and available on the Python Package Index (PyPI). The source code is publicly available on GitHub (https://github.com/gosling-lang/gos), and documentation with examples can be found at https://gosling-lang.github.io/gos.


Subject(s)
Computational Biology , Geese , Animals , Genomics , Genome , Gene Library , Software
8.
IEEE Trans Vis Comput Graph ; 29(1): 570-580, 2023 01.
Article in English | MEDLINE | ID: mdl-36191105

ABSTRACT

Interpretation of genomics data is critically reliant on the application of a wide range of visualization tools. A large number of visualization techniques for genomics data and different analysis tasks pose a significant challenge for analysts: which visualization technique is most likely to help them generate insights into their data? Since genomics analysts typically have limited training in data visualization, their choices are often based on trial and error or guided by technical details, such as data formats that a specific tool can load. This approach prevents them from making effective visualization choices for the many combinations of data types and analysis questions they encounter in their work. Visualization recommendation systems assist non-experts in creating data visualization by recommending appropriate visualizations based on the data and task characteristics. However, existing visualization recommendation systems are not designed to handle domain-specific problems. To address these challenges, we designed GenoREC, a novel visualization recommendation system for genomics. GenoREC enables genomics analysts to select effective visualizations based on a description of their data and analysis tasks. Here, we present the recommendation model that uses a knowledge-based method for choosing appropriate visualizations and a web application that enables analysts to input their requirements, explore recommended visualizations, and export them for their usage. Furthermore, we present the results of two user studies demonstrating that GenoREC recommends visualizations that are both accepted by domain experts and suited to address the given genomics analysis problem. All supplemental materials are available at https://osf.io/y73pt/.


Subject(s)
Computer Graphics , Data Visualization , Genomics/methods , Software
9.
J Biomed Inform ; 134: 104176, 2022 10.
Article in English | MEDLINE | ID: mdl-36007785

ABSTRACT

OBJECTIVE: For multi-center heterogeneous Real-World Data (RWD) with time-to-event outcomes and high-dimensional features, we propose the SurvMaximin algorithm to estimate Cox model feature coefficients for a target population by borrowing summary information from a set of health care centers without sharing patient-level information. MATERIALS AND METHODS: For each of the centers from which we want to borrow information to improve the prediction performance for the target population, a penalized Cox model is fitted to estimate feature coefficients for the center. Using estimated feature coefficients and the covariance matrix of the target population, we then obtain a SurvMaximin estimated set of feature coefficients for the target population. The target population can be an entire cohort comprised of all centers, corresponding to federated learning, or a single center, corresponding to transfer learning. RESULTS: Simulation studies and a real-world international electronic health records application study, with 15 participating health care centers across three countries (France, Germany, and the U.S.), show that the proposed SurvMaximin algorithm achieves comparable or higher accuracy compared with the estimator using only the information of the target site and other existing methods. The SurvMaximin estimator is robust to variations in sample sizes and estimated feature coefficients between centers, which amounts to significantly improved estimates for target sites with fewer observations. CONCLUSIONS: The SurvMaximin method is well suited for both federated and transfer learning in the high-dimensional survival analysis setting. SurvMaximin only requires a one-time summary information exchange from participating centers. Estimated regression vectors can be very heterogeneous. SurvMaximin provides robust Cox feature coefficient estimates without outcome information in the target population and is privacy-preserving.


Subject(s)
Algorithms , Electronic Health Records , Humans , Privacy , Proportional Hazards Models , Survival Analysis
10.
NPJ Digit Med ; 5(1): 81, 2022 Jun 29.
Article in English | MEDLINE | ID: mdl-35768548

ABSTRACT

The risk profiles of post-acute sequelae of COVID-19 (PASC) have not been well characterized in multi-national settings with appropriate controls. We leveraged electronic health record (EHR) data from 277 international hospitals representing 414,602 patients with COVID-19, 2.3 million control patients without COVID-19 in the inpatient and outpatient settings, and over 221 million diagnosis codes to systematically identify new-onset conditions enriched among patients with COVID-19 during the post-acute period. Compared to inpatient controls, inpatient COVID-19 cases were at significant risk for angina pectoris (RR 1.30, 95% CI 1.09-1.55), heart failure (RR 1.22, 95% CI 1.10-1.35), cognitive dysfunctions (RR 1.18, 95% CI 1.07-1.31), and fatigue (RR 1.18, 95% CI 1.07-1.30). Relative to outpatient controls, outpatient COVID-19 cases were at risk for pulmonary embolism (RR 2.10, 95% CI 1.58-2.76), venous embolism (RR 1.34, 95% CI 1.17-1.54), atrial fibrillation (RR 1.30, 95% CI 1.13-1.50), type 2 diabetes (RR 1.26, 95% CI 1.16-1.36) and vitamin D deficiency (RR 1.19, 95% CI 1.09-1.30). Outpatient COVID-19 cases were also at risk for loss of smell and taste (RR 2.42, 95% CI 1.90-3.06), inflammatory neuropathy (RR 1.66, 95% CI 1.21-2.27), and cognitive dysfunction (RR 1.18, 95% CI 1.04-1.33). The incidence of post-acute cardiovascular and pulmonary conditions decreased across time among inpatient cases while the incidence of cardiovascular, digestive, and metabolic conditions increased among outpatient cases. Our study, based on a federated international network, systematically identified robust conditions associated with PASC compared to control groups, underscoring the multifaceted cardiovascular and neurological phenotype profiles of PASC.
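
The figures above are reported as relative risks (RR) with 95% confidence intervals. As a reading aid, the sketch below shows one textbook way to compute an RR and a Wald interval on the log scale from a two-by-two table. It is not necessarily the method the study used (the abstract describes a federated analysis), and the counts are hypothetical.

```python
import math


def relative_risk(cases_exposed, n_exposed, cases_control, n_control, z=1.96):
    """Relative risk with a Wald confidence interval computed on the log scale."""
    risk_exposed = cases_exposed / n_exposed
    risk_control = cases_control / n_control
    rr = risk_exposed / risk_control
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / cases_exposed - 1 / n_exposed + 1 / cases_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, not taken from the study: 120 new-onset diagnoses among
# 10,000 cases versus 100 among 10,000 controls.
rr, lower, upper = relative_risk(120, 10_000, 100, 10_000)
print(f"RR {rr:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1, as in the estimates quoted in the abstract, indicates that the difference in risk is unlikely to be due to chance alone at the chosen confidence level.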

11.
NPJ Digit Med ; 5(1): 74, 2022 Jun 13.
Article in English | MEDLINE | ID: mdl-35697747

ABSTRACT

Given the growing number of prediction algorithms developed to predict COVID-19 mortality, we evaluated the transportability of a mortality prediction algorithm using a multi-national network of healthcare systems. We predicted COVID-19 mortality using baseline commonly measured laboratory values and standard demographic and clinical covariates across healthcare systems, countries, and continents. Specifically, we trained a Cox regression model with nine measured laboratory test values, standard demographics at admission, and comorbidity burden pre-admission. These models were compared at site, country, and continent level. Of the 39,969 hospitalized patients with COVID-19 (68.6% male), 5717 (14.3%) died. In the Cox model, age, albumin, AST, creatine, CRP, and white blood cell count are most predictive of mortality. The baseline covariates are more predictive of mortality during the early days of COVID-19 hospitalization. Models trained at healthcare systems with larger cohort size largely retain good transportability performance when porting to different sites. The combination of routine laboratory test values at admission along with basic demographic features can predict mortality in patients hospitalized with COVID-19. Importantly, this potentially deployable model differs from prior work by demonstrating not only consistent performance but also reliable transportability across healthcare systems in the US and Europe, highlighting the generalizability of this model and the overall approach.

12.
BMJ Open ; 12(6): e057725, 2022 06 23.
Article in English | MEDLINE | ID: mdl-35738646

ABSTRACT

OBJECTIVE: To assess changes in international mortality rates and laboratory recovery rates during hospitalisation for patients hospitalised with SARS-CoV-2 between the first wave (1 March to 30 June 2020) and the second wave (1 July 2020 to 31 January 2021) of the COVID-19 pandemic. DESIGN, SETTING AND PARTICIPANTS: This is a retrospective cohort study of 83 178 hospitalised patients admitted between 7 days before or 14 days after PCR-confirmed SARS-CoV-2 infection within the Consortium for Clinical Characterization of COVID-19 by Electronic Health Record, an international multihealthcare system collaborative of 288 hospitals in the USA and Europe. The laboratory recovery rates and mortality rates over time were compared between the two waves of the pandemic. PRIMARY AND SECONDARY OUTCOME MEASURES: The primary outcome was all-cause mortality rate within 28 days after hospitalisation stratified by predicted low, medium and high mortality risk at baseline. The secondary outcome was the average rate of change in laboratory values during the first week of hospitalisation. RESULTS: Baseline Charlson Comorbidity Index and laboratory values at admission were not significantly different between the first and second waves. The improvement in laboratory values over time was faster in the second wave compared with the first. The average C reactive protein rate of change was -4.72 mg/dL vs -4.14 mg/dL per day (p=0.05). The mortality rates within each risk category significantly decreased over time, with the most substantial decrease in the high-risk group (42.3% in March-April 2020 vs 30.8% in November 2020 to January 2021, p<0.001) and a moderate decrease in the intermediate-risk group (21.5% in March-April 2020 vs 14.3% in November 2020 to January 2021, p<0.001). 
CONCLUSIONS: Admission profiles of patients hospitalised with SARS-CoV-2 infection did not differ greatly between the first and second waves of the pandemic, but there were notable differences in laboratory improvement rates during hospitalisation. Mortality risks among patients with similar risk profiles decreased over the course of the pandemic. The improvement in laboratory values and mortality risk was consistent across multiple countries.


Subject(s)
COVID-19 , Pandemics , Hospitalization , Humans , Retrospective Studies , SARS-CoV-2
13.
IEEE Trans Vis Comput Graph ; 28(1): 140-150, 2022 01.
Article in English | MEDLINE | ID: mdl-34596551

ABSTRACT

The combination of diverse data types and analysis tasks in genomics has resulted in the development of a wide range of visualization techniques and tools. However, most existing tools are tailored to a specific problem or data type and offer limited customization, making it challenging to optimize visualizations for new analysis tasks or datasets. To address this challenge, we designed Gosling, a grammar for interactive and scalable genomics data visualization. Gosling balances expressiveness for comprehensive multi-scale genomics data visualizations with accessibility for domain scientists. Our accompanying JavaScript toolkit called Gosling.js provides scalable and interactive rendering. Gosling.js is built on top of an existing platform for web-based genomics data visualization to further simplify the visualization of common genomics data formats. We demonstrate the expressiveness of the grammar through a variety of real-world examples. Furthermore, we show how Gosling supports the design of novel genomics visualizations. An online editor and examples of Gosling.js, its source code, and documentation are available at https://gosling.js.org.
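
To give a sense of what a declarative visualization grammar looks like, the sketch below builds a small single-track specification in the style of Gosling and serializes it to JSON. The data URL is a placeholder, and the field names ("position", "peak", "sample") are assumptions about a hypothetical file. The property names follow the Gosling documentation as best recalled and should be checked against the current schema before use.

```python
import json

# A minimal single-track specification in the style of the Gosling grammar.
# The URL is a placeholder and the field names are assumed column names of a
# hypothetical multivec file; nothing here is taken from the paper.
spec = {
    "title": "Example track",
    "tracks": [
        {
            "data": {
                "url": "https://example.org/placeholder.multivec",
                "type": "multivec",
                "row": "sample",
                "column": "position",
                "value": "peak",
                "categories": ["sample 1"],
            },
            "mark": "bar",
            # Genomic position on the x axis, signal height on the y axis.
            "x": {"field": "position", "type": "genomic"},
            "y": {"field": "peak", "type": "quantitative"},
            "width": 600,
            "height": 130,
        }
    ],
}

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The point of the declarative style is that the specification names the data, the mark, and the visual encodings, and leaves rendering to the toolkit.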


Subject(s)
Computer Graphics , Data Visualization , Animals , Geese , Genomics , Software
15.
J Med Internet Res ; 23(10): e31400, 2021 10 11.
Article in English | MEDLINE | ID: mdl-34533459

ABSTRACT

BACKGROUND: Many countries have experienced 2 predominant waves of COVID-19-related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic. OBJECTIVE: In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries. We compared hospitalization rates, severe COVID-19 risk, and mean laboratory values between patients hospitalized during the first and second waves of the pandemic. METHODS: Using a federated approach, each participating health care system extracted patient-level clinical data on their first and second wave cohorts and submitted aggregated data to the central site. Data quality control steps were adopted at the central site to correct for implausible values and harmonize units. Statistical analyses were performed by computing individual health care system effect sizes and synthesizing these using random effect meta-analyses to account for heterogeneity. We focused the laboratory analysis on C-reactive protein (CRP), ferritin, fibrinogen, procalcitonin, D-dimer, and creatinine based on their reported associations with severe COVID-19. RESULTS: Data were available for 79,613 patients, of which 32,467 were hospitalized in the first wave and 47,146 in the second wave. The prevalence of male patients and patients aged 50 to 69 years decreased significantly between the first and second waves. Patients hospitalized in the second wave had a 9.9% reduction in the risk of severe COVID-19 compared to patients hospitalized in the first wave (95% CI 8.5%-11.3%). 
Demographic subgroup analyses indicated that patients aged 26 to 49 years and 50 to 69 years; male and female patients; and black patients had significantly lower risk for severe disease in the second wave than in the first wave. At admission, the mean values of CRP were significantly lower in the second wave than in the first wave. On the seventh hospital day, the mean values of CRP, ferritin, fibrinogen, and procalcitonin were significantly lower in the second wave than in the first wave. In general, countries exhibited variable changes in laboratory testing rates from the first to the second wave. At admission, there was a significantly higher testing rate for D-dimer in France, Germany, and Spain. CONCLUSIONS: Patients hospitalized in the second wave were at significantly lower risk for severe COVID-19. This corresponded to mean laboratory values in the second wave that were more likely to be in typical physiological ranges on the seventh hospital day compared to the first wave. Our federated approach demonstrated the feasibility and power of harmonizing heterogeneous EHR data from multiple international health care systems to rapidly conduct large-scale studies to characterize how COVID-19 clinical trajectories evolve.
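
The methods describe synthesizing per-system effect sizes with random-effects meta-analysis. The abstract does not name the estimator, so the sketch below uses the widely taught DerSimonian-Laird approach as an assumption. The effect sizes and variances are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool per-site effect sizes with the DerSimonian-Laird estimator."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two effects with matching variances")
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures between-site heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-site variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau2 to each site's own variance.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2


# Hypothetical per-system risk differences and their variances.
site_effects = [-0.12, -0.08, -0.10]
site_variances = [0.0004, 0.0009, 0.0006]
pooled, se, tau2 = random_effects_pool(site_effects, site_variances)
print(f"pooled = {pooled:.3f}, 95% CI half-width = {1.96 * se:.3f}, tau^2 = {tau2:.5f}")
```

Because each site contributes only an effect size and a variance, this kind of pooling fits the federated setup described above, in which only aggregated data are sent to the central site.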


Subject(s)
COVID-19 , Pandemics , Adult , Aged , Female , Hospitalization , Hospitals , Humans , Male , Middle Aged , Retrospective Studies , SARS-CoV-2
16.
Methods ; 124: 78-88, 2017 07 15.
Article in English | MEDLINE | ID: mdl-28600227

ABSTRACT

In this paper, we present miRTarVis+, a Web-based interactive visual analytics tool for miRNA target predictions and integrative analyses of multiple prediction results. Various microRNA (miRNA) target prediction algorithms have been developed to improve sequence-based miRNA target prediction by exploiting miRNA-mRNA expression profile data. There are also a few analytics tools to help researchers predict targets of miRNAs. However, there still is a need for improving the performance for miRNA prediction algorithms and more importantly for interactive visualization tools for an integrative analysis of multiple prediction results. miRTarVis+ has an intuitive interface to support the analysis pipeline of load, filter, predict, and visualize. It can predict targets of miRNA by adopting Bayesian inference and maximal information-based nonparametric exploration (MINE) analyses as well as conventional correlation and mutual information analyses. miRTarVis+ supports an integrative analysis of multiple prediction results by providing an overview of multiple prediction results and then allowing users to examine a selected miRNA-mRNA network in an interactive treemap and node-link diagram. To evaluate the effectiveness of miRTarVis+, we conducted two case studies using miRNA-mRNA expression profile data of asthma and breast cancer patients and demonstrated that miRTarVis+ helps users more comprehensively analyze targets of miRNA from miRNA-mRNA expression profile data. miRTarVis+ is available at http://hcil.snu.ac.kr/research/mirtarvisplus.
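
Of the analyses listed in this abstract, conventional correlation is the simplest to illustrate. The sketch below ranks candidate mRNA targets by their Pearson correlation with a miRNA expression profile. Treating a strongly negative correlation as evidence of targeting is an assumption commonly made in this kind of analysis, not a description of the tool's internals. The gene names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def rank_candidate_targets(mirna_profile, mrna_profiles):
    """Sort mRNAs from most negative to most positive correlation."""
    scores = {gene: pearson(mirna_profile, p) for gene, p in mrna_profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical expression values across five samples.
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
mrnas = {
    "GENE_A": [9.0, 8.1, 6.9, 6.2, 5.0],  # falls as the miRNA rises
    "GENE_B": [2.0, 2.5, 2.1, 2.4, 2.2],  # roughly flat
    "GENE_C": [1.0, 2.2, 2.9, 4.1, 5.0],  # rises with the miRNA
}
ranking = rank_candidate_targets(mirna, mrnas)
for gene, r in ranking:
    print(f"{gene}: r = {r:.3f}")
```

In this toy data GENE_A comes first in the ranking because its expression falls as the miRNA's rises.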


Subject(s)
Asthma/genetics , Breast Neoplasms/genetics , MicroRNAs/genetics , RNA, Messenger/genetics , Sequence Analysis, RNA/methods , User-Computer Interface , Algorithms , Asthma/diagnosis , Asthma/metabolism , Asthma/pathology , Base Sequence , Bayes Theorem , Binding Sites , Breast Neoplasms/diagnosis , Breast Neoplasms/metabolism , Breast Neoplasms/pathology , Female , Gene Expression Profiling , Gene Expression Regulation , Humans , Internet , MicroRNAs/metabolism , RNA, Messenger/metabolism
17.
BMC Bioinformatics ; 16 Suppl 11: S5, 2015.
Article in English | MEDLINE | ID: mdl-26328893

ABSTRACT

BACKGROUND: Though cluster analysis has become a routine analytic task for bioinformatics research, it is still arduous for researchers to assess the quality of a clustering result. To select the best clustering method and its parameters for a dataset, researchers have to run multiple clustering algorithms and compare them. However, such a comparison task with multiple clustering results is cognitively demanding and laborious. RESULTS: In this paper, we present XCluSim, a visual analytics tool that enables users to interactively compare multiple clustering results based on the Visual Information Seeking Mantra. We build a taxonomy for categorizing existing techniques of clustering results visualization in terms of the Gestalt principles of grouping. Using the taxonomy, we choose the most appropriate interactive visualizations for presenting individual clustering results from different types of clustering algorithms. The efficacy of XCluSim is shown through case studies with a bioinformatician. CONCLUSIONS: Compared to other relevant tools, XCluSim enables users to compare multiple clustering results in a more scalable manner. Moreover, XCluSim supports diverse clustering algorithms and dedicated visualizations and interactions for different types of clustering results, allowing more effective exploration of details on demand. Through case studies with a bioinformatics researcher, we received positive feedback on the functionalities of XCluSim, including its ability to help identify stably clustered items across multiple clustering results.
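
The idea of comparing clustering results and finding stably clustered items can be sketched without any visualization. The code below computes the Rand index between two clusterings and lists the groups of items that stay together in every run. The Rand index is a standard agreement measure chosen here for illustration; the abstract does not say which measures XCluSim uses. The cluster labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree."""
    if len(labels_a) != len(labels_b) or len(labels_a) < 2:
        raise ValueError("need two label lists of equal length (at least 2)")
    pairs = list(combinations(range(len(labels_a)), 2))
    # A pair agrees if it is together in both clusterings or apart in both.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def stable_groups(clusterings):
    """Groups of items that share a cluster in every clustering result."""
    groups = defaultdict(list)
    # An item's signature is the tuple of its labels across all runs.
    for item, signature in enumerate(zip(*clusterings)):
        groups[signature].append(item)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical cluster labels for six items from three clustering runs.
run_1 = [0, 0, 0, 1, 1, 2]
run_2 = [1, 1, 0, 2, 2, 2]
run_3 = [0, 0, 1, 1, 1, 1]

print(f"Rand index (run 1 vs run 2): {rand_index(run_1, run_2):.3f}")
print("Stable groups:", stable_groups([run_1, run_2, run_3]))
```

Label values do not need to match across runs; only whether two items share a label within each run matters.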


Subject(s)
Algorithms , Cluster Analysis , Computational Biology/methods , Computer Graphics , Software , Genome, Human , Humans